Epistasis can lead to fragmented neutral spaces and contingency in 
evolution 
In evolution, the effects of a single deleterious mutation can sometimes be compensated for by a second mutation 
which recovers the original phenotype. Such epistatic interactions have implications for the structure of genome 
space - namely, that networks of genomes encoding the same phenotype may not be connected by single muta- 
tional moves. We use the folding of RNA sequences into secondary structures as a model genotype-phenotype 
map and explore the neutral spaces corresponding to networks of genotypes with the same phenotype. In most 
of these networks, we find that it is not possible to connect all genotypes to one another by single point 
mutations. Instead, a network for a phenotypic structure with n bonds typically fragments into at least 2^n neutral 
components, often of similar size. While components of the same network generate the same phenotype, they 
show important variations in their properties, most strikingly in their evolvability and mutational robustness. This 
heterogeneity implies contingency in the evolutionary process. 
I. INTRODUCTION 

The course of evolution is shaped by the complex interac- 
tion between random mutations that change genotypes and 
natural selection that acts on variation between phenotypes. 
Progress in evolutionary theory is thus predicated on gaining 
further understanding of the structure of genotype-phenotype 
(GP) maps (1). These mappings exhibit many non-trivial 
properties. For example, as emphasised by Kimura (2), many 
mutations are neutral - they do not appreciably change the 
phenotype or fitness - leading to a many-to-one redundancy 
in the transformation from genotypes to phenotypes that has 
profound consequences for evolution. In the neutral theory of 
evolution, genetic changes that are invisible to selection 
can build up over time and may constitute the majority of mu- 
tations in an evolutionary lineage. Evidence for the abundance 
of neutral mutations can be found, for example, in homolo- 
gous proteins that differ in sequence, but perform the same or 
very similar tasks in different organisms. 

Epistasis describes another important property of GP maps: 
the phenotypic effect of a genetic change at a single locus may 
depend on the values of other genetic loci. That such depen- 
dencies should exist is not at all surprising. Given the many 
multi-scale physical processes involved in translating a geno- 
type into a phenotype, it is rather the absence of epistasis that 
might be expected to be the exception to the rule. 

Recent advances in high throughput techniques and in 
bioinformatics have facilitated many new experimental studies 
of epistasis. For example, Lunzer et al. (4) studied the leuB 
gene that codes for β-isopropylmalate dehydrogenase in both 
E. coli and P. aeruginosa. These two homologous proteins 
differ at 168 positions, but when the substitutions were introduced 
one at a time in E. coli, 63 of them were found to be 
deleterious, suggesting rampant epistasis, since 
their combined effect is neutral. Other recent studies have found 
large-scale epistasis in HIV-1 genes, and in mitochondrial 
transfer RNA from eukaryotes (7). These three examples 
constitute only a very small snapshot of a much larger 
body of literature that suggests that epistasis is widespread 
throughout the living world. 

The ubiquity of epistasis also implies that neutral evolution 
can play a key role in facilitating the genotypic background 
that allows evolution to climb an adaptive peak (10): A set 
of mutations can be initially neutral, but when the environ- 
ment or the genotype changes, they may either be adaptive 
themselves, or bring a population closer to potential adaptive 
innovations. In other words, neutral evolution may enhance 
evolvability, the ability of an organism to facilitate heritable 
phenotypic changes (11). For example, in a recent paper Hayden 
et al. (12) showed that allowing a population of ribozymes 
to accumulate neutral mutations greatly increased the popula- 
tion's ability to adapt to a new environment, and that this en- 
hanced evolvability could be traced to 'cryptic' variation that 
arose neutrally. 

In the context of evolution it is helpful to quantify epistasis 
in terms of the fitness that selection can act on. Epistasis 
manifests in many different ways. In this paper we concen- 
trate on just two of these. Consider, for example, a simple two 
allele two locus system with alleles a or A at locus one, and 
b or B at locus two. If the transition from ab to AB increases 
fitness, then sign epistasis describes the situation where 
either aB or Ab has a lower fitness than ab, whereas reciprocal 
sign epistasis occurs if both intermediate genetic 
states have a lower fitness than ab. Sign epistasis constrains 
the potential pathways that evolution can take towards high 
fitness phenotypes (15, 1), whereas reciprocal sign epistasis is 
a necessary, but not sufficient, condition for peaks in fitness 
landscapes (16). Even in this simple biallelic two locus system 
one can imagine other epistatic scenarios, and 
the potential for complexity increases greatly as more genetic 
loci are considered. 
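
The two definitions above can be illustrated with a short Python sketch. It is not part of the original analysis: the function name and the returned labels are hypothetical choices, and the code assumes, as the text does, that AB is fitter than ab.

```python
def classify_epistasis(f_ab, f_aB, f_Ab, f_AB):
    """Classify the path ab -> AB in a two-locus, two-allele system.

    Follows the definitions in the text, which assume f_AB > f_ab.
    The returned labels are illustrative strings only.
    """
    if f_AB <= f_ab:
        raise ValueError("the definitions assume that AB is fitter than ab")
    # Count the single-mutant intermediates that are less fit than ab.
    n_lower = sum(f < f_ab for f in (f_aB, f_Ab))
    if n_lower == 2:
        return "reciprocal sign epistasis"
    if n_lower == 1:
        return "sign epistasis"
    return "no sign epistasis"


if __name__ == "__main__":
    # Made-up fitness values, chosen only to exercise each branch.
    print(classify_epistasis(1.0, 0.5, 1.2, 2.0))
    print(classify_epistasis(1.0, 0.5, 0.4, 2.0))
```

With only one intermediate below ab, one monotonically increasing path to AB remains open; with both below ab, every single-step path must cross a fitness valley.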

The considerations above frame the main question to be ad- 
dressed in this paper: If epistasis constrains the pathways of 
adaptive mutations, can it also constrain the potential for neu- 
tral mutations to facilitate adaptation? 

Although epistasis can have many different consequences 
for neutral evolution, in this paper we will in particular focus 
on the role of neutral reciprocal sign epistasis: Consider our 
biallelic system - if both ab and AB have the same fitness, 
but Ab and aB are unviable, then the only way to get directly 
from ab to AB is through double mutations. In this context, 
it is helpful to define neutral networks (NNs): sets of 
genotypes that share the same phenotype. If we are in the 
regime of strong selection and weak mutation, the main case 
we consider in this paper, then double mutations will be very 
rare. One consequence of this reciprocal sign epistasis will be 
that an NN that contains ab and AB may be fragmented into 
separate neutral components (NCs). If the NN is fragmented 
into several NCs, this raises further questions like: Are these 
NCs homogeneous or heterogeneous? Does the potential for 
innovation depend on which NC a population finds itself in? 

There are many potential causes of neutral reciprocal sign 
epistasis. For example, any mechanism that resembles a lock 
and a key may need two compensatory mutations, one for 
the lock, and the other for the key, in order to restore func- 
tion. In his classic paper on compensatory mutations, Kimura 
(18) considered the case of two interacting amino acid sites 
for which a mutation in either amino acid is deleterious, but 
where a double mutation can restore the function. Although 
these two sites are physically close in the folded state, they 
may be far away along the protein backbone, and it is hard 
to be sure that a correlated set of mutations at other positions 
may not allow the two sites to change by single mutational 
steps. Thus, just as is the case for fitness peaks (16), neutral 
reciprocal sign epistasis is a necessary, but not sufficient, 
condition for disconnected NNs. 

In this context, it is also important to remember that the 
GP map is typically characterised by very high dimensions, 
a property whose consequences have been of recent theoretical 
interest. Briefly put, isolated fitness peaks are 
less likely to occur in high dimensional landscapes; instead, 
long neutral ridges feature much more prominently. NNs can 
be identified with these ridges, and by traversing these net- 
works, populations can explore large proportions of genotype 
space without having to cross fitness valleys. Similar argu- 
ments suggest that even when neutral reciprocal sign epis- 
tasis breaks a pathway between two genetic configurations, 
there may nevertheless be other pathways that connect up the 
NN. Thus an investigation of NCs necessitates either a fairly 
complete description of the GP map, or alternatively, a good 
enough understanding of local topology to ensure that an NN 
is disconnected. 

For these reasons we concentrate in this paper on a compu- 
tationally tractable and biologically motivated GP mapping. 
RNA strands can fold into well-defined three-dimensional 
structures driven by the specific bonding between AU, GU and 
GC base pairs, as well as stacking interactions between adja- 
cent bases. The RNA secondary structure describes the bond- 
ing pattern of a folded RNA strand of length L. There exist 
efficient and reliable algorithms that predict secondary struc- 
ture from primary sequence by minimising the free-energy. 
For the work presented here, we use the RNAfold program, 
version 1.8.4 (22). This system describes a map from a genotype 
of length L to a phenotype that is characterised by the 
secondary structure. It has been extensively studied, generating 
many important insights into evolutionary theory (1, 23-26). 
The RNA map has the advantage that for modest values 
of L one can perform an exhaustive enumeration, and from 
this completely characterise the connectivity of the NNs (27). 

The paper is organised as follows: In section II we establish 
that the NNs of most RNA secondary structure phenotypes are 
fragmented into disconnected NCs. We identify an important 
source of this fragmentation to be a particular kind of neutral 
reciprocal sign epistasis that arises from the biophysics of the 
GP map: Converting a purine-pyrimidine base pair (e.g. GC) 
into a pyrimidine-purine pair (e.g. CG) in an RNA stem motif 
cannot proceed by single mutations without passing through 
an intermediate of a different structure. By exhaustive enumeration 
of length L = 15 RNA sequences, we can study 
detailed properties of the NCs. We establish that many NNs 
can be split into multiple components with no particular NC 
being dominant. We also show that the fragmentation of these 
NNs will be sustained under crossover moves, implying that 
our results may be relevant for populations in both asexual and 
sexual regimes. 

We next examine some consequences of this fragmentation 
of NNs in section III. We show that the size of a given NC 
correlates with a measure of its robustness to genetic 
mutations. Since a typical NN is fragmented into multiple 
NCs of different size, this implies that the robustness of 
a given population will depend on which NC it is on, and not 
only on its phenotype. Similarly, we find that the number of 
phenotypes accessible within one point mutation of the NCs, 
a measure of their evolvability (28), varies significantly between 
different NCs in a given NN. This heterogeneity leads 
us to conclude that the evolutionary fate of a population is 
contingent on the NC it occupies in genotype space. Finally, 
in section IV we discuss our main results, and look beyond the 
RNA secondary structure GP map to consider which conclusions 
may hold for a wider class of systems, including gene 
regulatory networks, proteins and the genetic code. 



II. RNA NEUTRAL NETWORKS ARE FRAGMENTED 

FIG. 1 RNA neutral networks are split into many components. 
The number of NCs that make up the NN of a given structure is 
plotted against the size ranking of the structure, starting at 1 for the 
largest NN. The inset of the figure shows the consequence of allowing 
base pair exchanges as a fundamental evolutionary step. Data are 
for L = 15. 

The structure of NNs in RNA has been extensively studied 
previously. Here we briefly repeat some 
key results of this earlier work that are relevant for our investigations 
(see also Electronic Supplementary Material (ESM), Table S1): 



The number of structures is much smaller than the number 
of sequences, 4^L. The number of structures increases 
with L, but at a much slower rate than the number 
of sequences. 

The distribution of NN sizes is heavily skewed: 
a minority of the phenotypes occupy a majority of the 
genotypes. For a given L, NNs of more than average 
size are called large, and the corresponding secondary 
structures are said to be frequent. As L increases, 
the absolute number of frequent structures and the 
fraction of sequences folding into frequent structures 
both go up, while the fraction of NNs that are large 
goes down. 

The fraction of sequences that fold into the trivial struc- 
ture (that is the structure that has no bonds) decreases 
with L. 



The connectivity of NNs can be studied under the simplify- 
ing assumption that a network is made up of randomly chosen 
points on a genotypic hypercube. Analyses using graph theory 
then suggest that larger NNs are likely to be fully connected, 
while small ones are likely to be fragmented (30). 

For RNA secondary structure, however, it is important to 
also take the biophysics of bonding into account. In principle, 
each bond can be formed by one of six different nucleotide 
pairs: GC, CG, AU, UA, GU and UG. Point mutations can 
potentially connect GC ↔ GU ↔ AU and CG ↔ UG ↔ UA, 
but these two subspaces cannot be connected together by point 
mutations without breaking a bond. This type of neutral reciprocal 
sign epistasis suggests that for a structure with n bonds 
we can expect on the order of 2^n disjoint sets of compatible 
sequences (see footnote 1). This argument is independent of sequence length, 
and given that longer sequences may generate structures with 
more bonds, we expect the average number of NCs per NN to 
grow with L (see ESM, Table S2). 
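
The biophysical argument can be checked with a small sketch. It treats the six allowed base pairs as nodes, links two pairs when they differ at exactly one nucleotide, and finds the connected components. The helper names are hypothetical, and the 2^n count is the idealised estimate from the text, ignoring the energetic caveats of footnote 1.

```python
PAIRS = ["GC", "CG", "AU", "UA", "GU", "UG"]


def point_neighbours(pair):
    """Allowed pairs that differ from `pair` at exactly one nucleotide."""
    return [q for q in PAIRS if sum(a != b for a, b in zip(pair, q)) == 1]


def pair_components():
    """Connected components of the six base pairs under point mutations."""
    seen, comps = set(), []
    for start in PAIRS:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            p = stack.pop()
            if p not in comp:
                comp.add(p)
                stack.extend(point_neighbours(p))
        seen |= comp
        comps.append(comp)
    return comps


def expected_disjoint_sets(n_bonds):
    """Idealised estimate: each bond independently picks one of two subspaces."""
    return len(pair_components()) ** n_bonds


if __name__ == "__main__":
    print(pair_components())
    print(expected_disjoint_sets(3))
```

The two components recovered are {GC, GU, AU} and {CG, UG, UA}, so a structure with three bonds gives an estimate of 2^3 = 8 disjoint sets.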

We therefore predict that virtually all NNs for RNA sec- 
ondary structure should be fragmented. By contrast, if double 
mutations (base-pair swaps) are allowed, then the results from 
random graph theory give a good estimate of the connectivity 
of an NN (l30h. But, in nature, base-pair swaps are expected to 
be very rare fF). While the fact that RNA secondary structure 
NNs are not fully connected has been widely acknowledged in 
the literature JztI; l29l : Isil) . the potential consequences of this 
fragmentation have not yet been fully explored. 

In order to determine the connectivity of the NNs, we start 
from a random sequence in the NN and follow all neutral mu- 
tations that can be accessed (22). Sequence space grows ex- 
ponentially with length L, so this exhaustive approach is only 
feasible for relatively short sequences; we will mainly present 
results for sequence lengths up to L = 15, but will also consider 
other lengths where appropriate. As has been done in 
many other studies (29), we ignore the trivial structure 
with no bonds (see footnote 2). 
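
The search described above amounts to a flood fill over neutral point mutants. The following is a minimal sketch under stated assumptions: `phenotype` is any function mapping a sequence to a hashable phenotype (in this paper it would be a secondary-structure prediction from RNAfold), and `toy_phenotype` is a hypothetical stand-in that only asks whether the first and last nucleotides can pair, used so that the example runs without a folding package.

```python
from collections import deque

ALPHABET = "ACGU"
PAIRS = {"GC", "CG", "AU", "UA", "GU", "UG"}


def point_mutants(seq):
    """Yield every sequence that differs from `seq` at exactly one position."""
    for i, old in enumerate(seq):
        for new in ALPHABET:
            if new != old:
                yield seq[:i] + new + seq[i + 1:]


def neutral_component(start, phenotype):
    """Breadth-first search over neutral point mutants, starting from `start`."""
    target = phenotype(start)
    component = {start}
    queue = deque([start])
    while queue:
        seq = queue.popleft()
        for mutant in point_mutants(seq):
            if mutant not in component and phenotype(mutant) == target:
                component.add(mutant)
                queue.append(mutant)
    return component


def toy_phenotype(seq):
    """Hypothetical stand-in for a folding algorithm (not RNAfold)."""
    return "paired" if seq[0] + seq[-1] in PAIRS else "open"


if __name__ == "__main__":
    nc = neutral_component("GAC", toy_phenotype)
    print(len(nc), "CAG" in nc)
```

In the toy map, GAC and CAG share the phenotype but lie on different NCs, because a G-C bond cannot become a C-G bond without passing through an unpaired intermediate. Repeating the search from every sequence not yet assigned to a component yields all NCs of an NN.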

Our L = 15 system has 431 distinct secondary structures 
(at a folding temperature of 37°C) of which 86 or about 20% 
are large. The large structures cover 93% of the folding se- 
quences. By exhaustively searching through all of sequence 
space, we are able to identify all 12526 components, so that 
there are on average about 29 components per neutral network. 
Figure 1 shows how these are distributed among the 
different NNs. The largest number of components is 216 
for a relatively infrequent structure ranked 206th, and only 
a few small structures have a single NC (the largest has rank 
333). We summarise the data in ESM, Tables S1 and S2 and Figs. 
S2 and S3. The NCs can also be ranked individually by size, and 
their size distribution is even more skewed than that of the NNs. 
Overall, 1120 NCs (less than 10%) are larger than average, but 
together they cover 95% of non-trivial genotype space. 
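
The summary figures quoted in this paragraph can be reproduced with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the helper `count_large` and the example sizes passed to it are hypothetical and merely illustrate the "larger than average" criterion.

```python
n_structures = 431    # distinct non-trivial secondary structures at L = 15
n_large_nn = 86       # NNs larger than the average NN
n_components = 12526  # NCs found by exhaustive search
n_large_nc = 1120     # NCs larger than the average NC

frac_large_nn = n_large_nn / n_structures      # about 0.20
mean_nc_per_nn = n_components / n_structures   # about 29
frac_large_nc = n_large_nc / n_components      # about 0.09
total_sequences = 4 ** 15                      # size of genotype space


def count_large(sizes):
    """Number of entries strictly larger than the mean of `sizes`."""
    mean = sum(sizes) / len(sizes)
    return sum(s > mean for s in sizes)


if __name__ == "__main__":
    print(f"{frac_large_nn:.3f} {mean_nc_per_nn:.2f} {frac_large_nc:.3f}")
    print(total_sequences)
    print(count_large([100, 90, 5, 5]))  # made-up sizes
```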

By analogy to NNs we call an NC large if its size is more 
than the average in its NN. Most NNs contain several large 
NCs rather than being dominated by a single one (see ESM, Fig. S4). 
The number of large NCs in an NN is strongly correlated 
with 2^n, where n is the number of bonds in the corresponding 
secondary structure (r = 0.74). In contrast, there is hardly 
any correlation between the number of large NCs and NN size 
(r = -0.01). 

In Figure 2 we show the size of the components for the 
largest 26 NNs that, together, cover over 50% of the folding 
genotypes. Most networks have more than the 2^n components 
we expect due to the biophysical argument given above; 
nonetheless the 2^n largest NCs are generally very similar in 
size, and much larger than all the smaller NCs of the network. 
Note that the largest NC is for an NN that is ranked 12th by 
overall size, and more generally that the size of these largest 
26 NNs is not a reliable guide to the average size of the large 
NCs. 

Footnote 1: There may be other causes of fragmentation, and bonds vary in energy, so 
not all possible combinations will lead to the same secondary structure. 

Footnote 2: In systems where folded structures have an adaptive advantage, it is likely 
that the completely unfolded strand has very low fitness, and so can be ignored. 
There is also a practical reason for this choice. The trivial structure 
is much more frequent for small L than for large L, and so it could affect 
the applicability of our results for much longer structures. 

FIG. 2 RNA NNs contain several large NCs. The 26 largest NNs 
shown here cover just over 50% of folding genotype space. For each 
NN, the sizes of all its NCs are shown. The shaded NC denotes the 
2^n-th NC (n is the number of bonds), and the thick black lines at 
the top of a stack indicate the existence of significantly smaller NCs. 
The 12 most abundant secondary structures are shown in the ESM 
(see Table S3 and the accompanying figure). 

If in addition to the point mutations, we also allow base-pair 
swaps, then the number of components drops significantly. 
In particular, as predicted by random graph theory (30), the 
majority of large NNs are dominated by a single giant NC. 
This big difference, caused by introducing base pair swaps, 
strongly suggests that the NN fragmentation we observe un- 
der point mutations arises from the simple neutral reciprocal 
sign epistasis mechanism we identified above. 

While single point mutations cannot connect up the NCs, 
one may consider whether crossover moves may do so. In that 
context it is helpful to consider Kimura's analogy to a lock-key 
system (18): A change in the lock makes it necessary for 
the key to be changed accordingly. Crossing over one lock-key 
setup with another can only be successful in two cases. 
First, if the lock and key originate from different parents, successful 
offspring will arise only if the parents are compatible. 
Second, the lock and key may originate from the same parent; 
this requires that crossover occur at special points in the 
sequence to ensure a matching lock and key. 

In RNA, the first case means that both parental sequences 
belong to the same NC and that the offspring consequently 
stay in that NC. The second case is possible only if the point 
of crossover is outside the looped region of the stem that is incompatible 
in the parent sequences. This is illustrated in ESM, 
Fig. S9. Under the second condition, crossover can put offspring 
onto NCs that are distinct from either parent; however, 
such crossover allows the exploration of only a small subset of all 
possible NCs in an NN. Even this limited exploration is predicated 
on the population being distributed on multiple NCs 
in the first place, but this cannot be achieved without compensatory 
mutations. It is worth noting that crossover slows down 
the rate of fixation of compensatory mutations (18). 

So far, we have shown that under fairly general condi- 
tions, the NNs of RNA secondary structure are fragmented 
into many NCs. This fact raises the following question: Are 
the different NCs similar or heterogeneous in their properties? 

III. NEUTRAL COMPONENTS SHAPE EVOLUTIONARY 
TRAJECTORIES 

A. Robustness increases with component size 

The robustness to genetic change has been widely studied 
in the context of NNs. In particular, van Nimwegen et al. 
have shown how the robustness of an evolving population depends 
on the structure of the underlying neutral space (31). 
While the dynamic properties of a population depend also on 
its size and mutation rate, we consider here only the effect of 
the structure of the NC. To this end, we define the mutational 
robustness of a genotype as the fraction of mutations that leave 
the phenotype unchanged. In analogy to (28) we calculate the 
robustness of an NC by averaging the genotypic robustness 
of all genotypes in the NC. This measure gives the expected 
average robustness of a monomorphic population evolving on 
the NC (31). 
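
These two definitions translate directly into code. The sketch below is illustrative only: `toy_phenotype` is a hypothetical stand-in for the folding map (it checks whether the first and last nucleotides can pair), and all function names are our own.

```python
ALPHABET = "ACGU"
PAIRS = {"GC", "CG", "AU", "UA", "GU", "UG"}


def point_mutants(seq):
    """Yield every sequence that differs from `seq` at exactly one position."""
    for i, old in enumerate(seq):
        for new in ALPHABET:
            if new != old:
                yield seq[:i] + new + seq[i + 1:]


def toy_phenotype(seq):
    """Hypothetical stand-in for a folding algorithm (not RNAfold)."""
    return "paired" if seq[0] + seq[-1] in PAIRS else "open"


def genotype_robustness(seq, phenotype):
    """Fraction of the 3L point mutants that keep the phenotype of `seq`."""
    target = phenotype(seq)
    mutants = list(point_mutants(seq))
    return sum(phenotype(m) == target for m in mutants) / len(mutants)


def component_robustness(component, phenotype):
    """Average genotype robustness over all genotypes in one NC."""
    total = sum(genotype_robustness(s, phenotype) for s in component)
    return total / len(component)


if __name__ == "__main__":
    # One NC of the toy map: bond in {GC, GU, AU}, any middle nucleotide.
    nc = {p[0] + m + p[1] for p in ("GC", "GU", "AU") for m in ALPHABET}
    print(genotype_robustness("GAC", toy_phenotype))
    print(component_robustness(nc, toy_phenotype))
```

Because every neutral point mutant of a genotype lies on the same NC as that genotype, the fraction of phenotype-preserving mutations equals the probability of staying on the NC.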

In agreement with earlier results based on sampling tech- 
niques (28) we observe a clear positive correlation between 
mutational robustness of an NC and its size (r = 0.47), as 
illustrated in Fig. 3. Hence, the larger the NC, the more likely 
individuals are to pass their phenotype on to their offspring 
after a random mutation. Given the large heterogeneity of NC 
sizes comprising a given NN, these results suggest that ro- 
bustness estimates based on the NN as a whole will not be 
representative of the robustness experienced by a population 
confined to a given NC. For example, if a population is restricted 
to a small component of a very large NN, the effective 
mutational robustness will be (much) lower than that estimated 
for the NN as a whole. 



B. Evolvability varies between components of the same 
phenotype 

The evolvability of a population is related to its ability to 
produce heritable phenotypic change (11). One might naively 
think that the more robust a phenotype is to mutations, the 
harder it is for mutations to generate novelty. However, this 
argument ignores the ability of neutral exploration to pave the 
way for future adaptive innovations (10). Wagner has pro- 
posed a proxy measure of evolvability that counts the number 
of phenotypes E_p that can be reached by a single mutation 
from a given NN (28). He showed that this measure also correlates 
positively with the size of an NN, and argued that phenotypes 
with larger NNs may be simultaneously more robust 
and more evolvable. But if the NNs are fragmented into separate 
NCs, then it is in fact the NC robustness and evolvability 
that matter to a population, and not the properties of the whole 
NN, which it cannot access by point mutations. Nevertheless, 
we find that the average robustness and evolvability of individual 
NCs are positively correlated (Fig. 3; ESM, Figs. S10 
and S11), just as was found for NNs. We can thus qualify 
Wagner's result (28): Robustness and evolvability are not so 
much correlated at the level of phenotypes (NNs), but rather 
the correlation holds at the level of an NC (our results yield 
r = 0.81), which can vary strongly within one NN. 

FIG. 3 Robustness and evolvability increase with NC size. The 
evolvability counts the number of phenotypes that can be reached 
from an NC; robustness is the average probability that a mutation 
results in a genotype in the same NC. The projection into the 
robustness-evolvability plane illustrates their positive correlation. 

Thus different populations with the same phenotype may 
exhibit significantly different evolvabilities. These differ- 
ences can be further quantified with the following definitions: 



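E^{(J)} = \left| \bigcup_{c \in \mathrm{NN}} E_c \right| \qquad (1)

E^{(C)} = \left| \bigcap_{c \in \mathrm{NN}} E_c \right| \qquad (2)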



where E_c is the set of phenotypes that can be reached by a single 
mutation from NC c. Thus the joint evolvability E^{(J)} (Eq. 1) counts 
the number of structures that can be reached from at least one 
NC in the NN, while the common evolvability E^{(C)} (Eq. 2) counts the 
structures available from all NCs. The comparison of these 
two properties reveals significant heterogeneity in the phenotypic 
neighbourhoods of NCs in the same NN (see Fig. 4a). 
In fact the joint and common evolvability are only identical 
for those small NNs that are fully connected. For most NCs, 
a population will only be able to access a restricted subset of 
the entire NN's neighbouring phenotypes: Averaged over all 
NNs, F ≡ E^{(C)} / E^{(J)} ≈ 0.14. There is no significant correlation 
of F with NN size: r = 0.04, p = 0.41. 
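
As a minimal sketch of these quantities, the joint and common evolvability of one NN can be computed from the per-NC neighbour sets with plain set operations. The function names and the example phenotype labels are hypothetical; the input is assumed to be a non-empty list with one set of reachable phenotypes per NC.

```python
def joint_evolvability(neighbourhoods):
    """Number of phenotypes reachable from at least one NC (size of the union)."""
    return len(set().union(*neighbourhoods))


def common_evolvability(neighbourhoods):
    """Number of phenotypes reachable from every NC (size of the intersection)."""
    return len(set.intersection(*(set(e) for e in neighbourhoods)))


def evolvability_ratio(neighbourhoods):
    """Ratio F of common to joint evolvability for one NN."""
    return common_evolvability(neighbourhoods) / joint_evolvability(neighbourhoods)


if __name__ == "__main__":
    # Made-up neighbour sets for an NN with three NCs.
    e_c = [{"s1", "s2", "s3"}, {"s2", "s3", "s4"}, {"s3"}]
    print(joint_evolvability(e_c), common_evolvability(e_c), evolvability_ratio(e_c))
```

For a fully connected NN the list holds a single set, so union and intersection coincide and F = 1, which is the trivial case marked in Fig. 4.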

We can further explore this heterogeneity by restricting our 
analysis to the large NCs only. We expect the differences 
to diminish because the joint evolvability has a lower bound 
given by the most evolvable NC while the common evolvabil- 
ity cannot be larger than for the least evolvable NC, which is 
typically very small. As Fig. 4b shows, the ratio of common 
to joint evolvability increases when only large NCs are taken into 
account: F_large = 0.37 with a weak correlation with NN size: 
r = 0.17, p = 4.5 × 10^-4. We note that some phenotypes 
are only accessible from small NCs so the joint evolvability 
decreases slightly (on average by about 10%). In ESM, Fig. 
S12 we restrict the phenotypes further to just those that are 
large; the same general results hold. Finally, we can ask 
what fraction of the joint evolvability is accessible on average 
from a single NC. If we consider only the large NCs 
of the frequent phenotypes, this fraction is on average 76% 
(in agreement with (29)), while averaging over all NCs in all 
NNs brings this down to 42% (see also ESM, Fig. S13). Instead 
of requiring large NCs to be greater than the average 
NC in their NN, we also employed an entropy-based criterion 
and obtained qualitatively similar results (ESM, Sec. S3.C and 
Fig. S14). 

It is important to consider whether this discrepancy is an 
artefact of the relatively short sequences we study. Answering 
this question by exhaustive enumeration is infeasible. Instead, 
we employed a sampling technique (ESM, Sec. S4) for 
sequences of 20 nucleotides. We find that the heterogeneity 
between NCs becomes even more pronounced as the sequence 
length increases (ESM, Fig. S15). 

Taken together, we have arrived at a key result: the potential 
for future innovation does not only depend on the current phe- 
notype, but also on which NC a population occupies. The fact 
that different NCs provide access to different new phenotypes 
suggests a new mechanism for contingency in evolution. A 
dynamic setting in which this may be particularly important is 
a polymorphic population with genotypes from two (or more) 
NCs. If environmental changes are sufficiently rapid (that is 
faster than genetic drift), this could drive parts of the popula- 
tion to different phenotypes, potentially aiding diversification 
at the phenotype level (21). 



IV. DISCUSSION 

We have shown how neutral reciprocal sign epistasis in 
RNA leads to fragmentation of NNs into multiple compo- 
nents. For many of the NNs, no one component dominates. 
Moreover, the components are heterogeneous, so that differ- 
ent populations with the same phenotype, but different NCs, 
may show large variations in robustness and evolvability. 

These inferences were possible because of the tractability 
of the GP map between an RNA sequence and its secondary 
structure. An obvious question is whether our results extend 
to other maps. Boldhaus and Klemm (32) studied a coarse-grained 
Boolean threshold dynamics model for the regulatory 
network of the yeast cell cycle and identified nearly half 
a billion functional NCs, ranging in size between 6.1 x 10^^ 
and 4.4 x 10^^ genotypes. Interestingly, the wild type network 
is part of one of the smaller NCs. It contains networks which 
are quite sparse and noise-resilient, indicating that there are 
secondary aspects in the performance of the network which 
can be selected for. This example also shows heterogeneity in 
the properties of NCs. One caveat is that the point mutations 
were in an abstract space with discretised interactions. It is 
not yet clear how a more realistic model of mutations would 
affect the NCs. 
FIG. 4 Phenotypic neighbourhoods are heterogeneous among NCs in the same NN. (a) Here the joint and common evolvability are shown 
considering all NCs of each NN. Square markers indicate fully connected NNs, for which joint and common evolvability trivially coincide. 
(b) Here only the large NCs in each NN were used for the calculations. Now square markers indicate NNs with only one large NC, for which 
again the equality is trivial. In both panels the black dashed line indicates the equality of joint and common evolvability. 



Recent experimental reconstructions of fitness landscapes 
may also open up avenues to study NCs. For example, in an 
important paper, Weinreich et al. (15) characterized all 32 
combinations of 5 mutations that together increase resistance 
to a particular antibiotic by a factor of about 10^5. By measuring 
the resistance of each possible combination, they produced 
a phenotype landscape; in examining their data, we found that 
this landscape also contains several NCs (see ESM, Fig. S16). 

There are two important caveats to this finding: First, the 
resistance scale used in the experiment is relatively coarse. 
Thus the neutrality in this landscape may even be broken by 
relatively small populations. In general, we stress that neutral- 
ity is always an effective statement, depending on population 
size (i) .Second, we cannot exclude the existence of neutral 
connections that were outside the scope of the experiment. 
Excluding such paths by exhaustively cataloguing all possi- 
ble mutations would be prohibitive. Progress can be made by 
studying the biophysics of a GP map, and looking for exam- 
ples of lock and key type systems. In proteins, binding sites 
may be potential candidates [l^. However, as reviewed by 
Poelwijk et al. ( 14), even lock and key systems can some- 
times evolve in subtle ways through single mutations. 

Another system to consider is the genetic code. It is in- 
teresting to note that with the exception of serine, all sets of 
codons coding for a particular amino acid can be reached by 
single synonymous point mutations. However, serine has two 
NCs, one made up of AGU and AGC, and the other of UCU, 
UCC, UCA and UCG. Given that serine often plays a key role 
in active sites in proteins, it may be that it cannot easily be 
neutrally replaced by another amino acid, so that these two 
NCs may indeed be separate in nature. It is noteworthy that 
this high NN connectivity is extremely unlikely to arise in a 
random genetic code with the same degeneracy as the universal 
code (ESM, Fig. S17). As robustness correlates with 
NC (and not NN) size, this striking degree of NN connectivity 
may be a by-product of selection for other properties such as 
robustness of the genetic code to point mutations or translation 
errors (26). 
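
The statement about serine can be verified directly from the standard genetic code. The sketch below builds the codon table from the conventional compact string (bases ordered U, C, A, G; `*` marks stop codons) and groups the synonymous codons of each amino acid into components connected by single point mutations. The function and variable names are our own.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_components(code):
    """Map each amino acid to the NCs of its codon set under point mutations."""
    by_aa = defaultdict(set)
    for codon, aa in code.items():
        by_aa[aa].add(codon)
    result = {}
    for aa, codons in by_aa.items():
        remaining, comps = set(codons), []
        while remaining:
            stack = [remaining.pop()]
            comp = set(stack)
            while stack:
                c = stack.pop()
                # Synonymous codons one point mutation away from c.
                near = {d for d in remaining
                        if sum(x != y for x, y in zip(c, d)) == 1}
                remaining -= near
                comp |= near
                stack.extend(near)
            comps.append(comp)
        result[aa] = comps
    return result


if __name__ == "__main__":
    comps = synonymous_components(CODE)
    print([aa for aa, c in comps.items() if len(c) > 1])
    print(sorted(sorted(c) for c in comps["S"]))
```

Leucine and arginine also have six codons each, but their codon sets remain connected (for example UUA to CUA, and CGA to AGA), which is why serine is the only exception.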



We have focussed on the approximation of strong selection 
and weak mutations where double mutations are excluded. 
However, compensatory mutations can occur if the fitness 
penalties are weak, or if mutation rates are high. Measur- 
ing fitness is notoriously difficult, but a recent study of com- 
pensatory mutations in mitochondrial transfer RNA estimates 
that transitions from GC to AU may occur through low fitness 
GU and AC intermediates (7). By contrast, switches like 
AU ↔ UA, GC ↔ CG and AU ↔ CG, each of which requires a 
transversion, were found to be very rare. These results suggest 
that in nature we should expect fragmented neutral spaces in 
RNA to be common. 

Nevertheless, a sufficiently large population and/or high 
mutation rate can lead to a regime in which NCs are effec- 
tively connected, so that evolutionary dynamics may be less 
sensitive to the effects of NN fragmentation. In the opposite 
limit of small populations and/or mutation rates, the average 
spread of a population in genotype space can be much smaller 
than the size of many NCs, implying that the local NC struc- 
ture becomes more important. For a fixed mutation rate, there 
will thus be a crossover in the effect of NCs on evolutionary 
dynamics with increasing population size. More generally, the 
dependence of evolvability and neutral space exploration
on dynamic parameters is an important issue that we plan to 
address in a future publication. 

Our analysis has only considered the local phenotypic 
neighbourhood of individual NCs. Over longer evolution- 
ary timescales, populations evolve from one phenotype (and 
hence NC) to another and traverse the phenotypic landscape. 
In order to understand the importance of landscape structure 
on such long timescales, it is necessary to study not only ac- 
cessible phenotypes, but also the connectivity among NCs, 
which will be the focus of future studies. 

In conclusion then, we have focussed on one striking effect 
of epistasis on neutral evolution, namely the fragmentation of 
neutral spaces. The heterogeneity of the resultant NCs is im- 
portant both conceptually and in practice: Properties such as 
the robustness and evolvability of an evolving population may
not only depend on its phenotype, but also on which NC of 
that phenotype the population occupies. This sensitivity may 
lead to contingency in evolution: The evolutionary trajectory 
of a population depends not only on the occurrence of ran- 
dom mutations, but also on the possible innovations that are 
available to the NCs it happens upon. 

S1. INTRODUCTION

The state space for evolving individuals is genotype space. For DNA (or RNA) sequences of length L, genotype space contains
$4^L$ discrete points, each one corresponding to a unique sequence. Each point can be linked to 3L one-mutational neighbours
which differ from the original genotype at only one nucleotide in the sequence. These connections between genotypes create a
generalized hypercube in L-dimensional space.
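
The counting above can be illustrated with a minimal sketch. The function name `one_mutant_neighbours`, the alphabet ordering and the example sequence are illustrative choices and are not taken from the paper.

```python
ALPHABET = "ACGU"  # RNA alphabet; the ordering is an arbitrary illustrative choice


def one_mutant_neighbours(seq):
    """Return all sequences that differ from `seq` at exactly one position."""
    neighbours = []
    for i, base in enumerate(seq):
        for other in ALPHABET:
            if other != base:
                neighbours.append(seq[:i] + other + seq[i + 1:])
    return neighbours


seq = "GGAACC"  # hypothetical example sequence with L = 6
print(len(ALPHABET) ** len(seq))        # size of genotype space, 4^L = 4096
print(len(one_mutant_neighbours(seq)))  # number of neighbours, 3L = 18
```

Every genotype has the same number of neighbours, which is why the hypercube has no boundary.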

For the purpose of visualisation, we may think of the mapping from genotypes to phenotypes as a colouring of the hypercube.
Then all the vertices (genotypes) with the same colour (phenotype) make up a neutral network (NN). Neutral components
(NCs) are sets of genotypes that are connected on the hypercube and share the same phenotype.
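
To make the distinction between an NN and its NCs concrete, the following sketch splits a set of genotypes with a common phenotype into connected components by breadth-first search. The toy GP map (a two-letter genotype is "paired" if its two bases can form a bond) is a hypothetical illustration, not the RNA folding map used in the paper.

```python
from collections import deque
from itertools import product

ALPHABET = "ACGU"
PAIRS = {"GC", "CG", "AU", "UA", "GU", "UG"}  # base combinations that can bond


def neighbours(seq):
    """Yield all one-mutant neighbours of `seq`."""
    for i, base in enumerate(seq):
        for other in ALPHABET:
            if other != base:
                yield seq[:i] + other + seq[i + 1:]


def neutral_components(network):
    """Split a set of genotypes sharing one phenotype into connected components."""
    unvisited = set(network)
    components = []
    while unvisited:
        start = unvisited.pop()
        component, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            for nb in neighbours(current):
                if nb in unvisited:
                    unvisited.remove(nb)
                    component.add(nb)
                    queue.append(nb)
        components.append(component)
    return components


# Toy neutral network: all two-letter genotypes with the phenotype "paired".
genotypes = ["".join(p) for p in product(ALPHABET, repeat=2)]
paired_network = {g for g in genotypes if g in PAIRS}
components = neutral_components(paired_network)
for component in sorted(components, key=sorted):
    print(sorted(component))
```

In this toy map the six "paired" genotypes form one NN that splits into two NCs of three genotypes each.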

It is hard to produce an accurate low-dimensional representation of the genotype hypercube. In Figure S1 we show a simplified
picture. It is intended to illustrate the existence of neutral networks and their components. In biologically relevant
genotype spaces, the vertices have many more neighbours; in addition, there are no boundaries on the hypercube. 

FIG. S1 An illustration of genotype space. The picture shows a simplified genotype space. Each marker corresponds to a genotype;
phenotypes are coded by shape and colour. Solid lines indicate neutral mutations, black dashed connections are non-neutral mutations.
This figure illustrates how neutral networks may be fragmented into separate neutral components. Note how different components for the same phenotype
differ in what other phenotypes can be reached by single point mutations. In interpreting such pictures, it should be kept in mind that real
genotype spaces have much higher dimensionality and no boundaries so that all nodes have the same number of neighbours.

S2. RNA NEUTRAL NETWORKS ARE FRAGMENTED

Due to its computational tractability and biological relevance, the folding of RNA sequences into secondary structures is a 
widely studied GP map (17, 23, 25, 27-29). In Table S1 we provide data on some well-known characteristics of this map, and the
scaling of these properties with sequence length L. In particular, the total number of different secondary structures $n_s$ increases
exponentially with L. Yet this increase is slower than the expansion of sequence space as a whole (which grows as $4^L$), so that
the average neutral network size $\langle V_s \rangle$ also increases with L. We compute this average with the trivial structure excluded: this
structure is extremely frequent for the short sequences which we study, but it is clear that its abundance $r_{\mathrm{triv}}$ decreases quickly
as L increases. 

Figure S2 shows the distribution of NN sizes, by which we mean the number of genotypes in the respective NN. The size
distribution is strongly skewed: A few structures are frequent while most structures are rare. To be precise, we call a secondary
structure frequent if its NN is larger than the average NN. Table S1 shows that the absolute number of frequent structures $n_{\mathrm{freq}}$
increases with sequence length, while the fraction of structures that are frequent decreases for longer sequences. Nonetheless
the fraction of genotypes that map to one of the frequent phenotypes $r_{\mathrm{freq}}$ grows with L.

TABLE S1 Confirmation of well-known results about RNA secondary structures. For sequence length L, the table lists the number of
non-trivial structures $n_s$, the fraction of sequences $r_{\mathrm{triv}}$ that fold into the trivial structure, the average neutral network size $\langle V_s \rangle$, the number of
frequent structures $n_{\mathrm{freq}}$, and the proportion $r_{\mathrm{freq}}$ of sequences in frequent structures to all sequences with a non-trivial structure.

FIG. S2 The distribution of NN sizes is skewed. The figure shows the size of each NN against its rank, starting at rank 1 for the most
frequent structure. The yellow line indicates the average NN size; only 86 NNs (that is, 20%) are larger than this. These large NNs contain
93% of all folding sequences (cf. Table S1).

In Table S2 we show similar results, focusing on neutral components rather than networks. Just as the number of NNs ($n_s$
in Table S1), the total number of NCs $n_c$ increases with sequence length. More interesting is the result that the average number
of components per network $n_c/n_s$ also increases with L. It is worth noting that (for the range of L we studied here) the mean
number of components per network roughly doubles when 2 more bases are added to the sequence. Crudely speaking, two more
bases allow an extra base pair to form. This rough argument then agrees nicely with our claim from the main text that the number
of NCs can be expected to scale as $2^n$ (where n is the number of base pairs in the structure).

While the distribution of NN sizes is heterogeneous (cf. Figure S2), the distribution of NC sizes shows an even more pronounced
skew (see Fig. S3). Again defining an NC to be large if it is greater than the average NC, the fraction of large NCs is
smaller than the fraction of large NNs. Nonetheless, this smaller proportion of components contains an even larger fraction of 
genotypes than the large NNs. 

Overall, these global considerations indicate a strong heterogeneity in genotype space. Does this heterogeneity also exist
within individual NNs? In Figure S4 we show the number of large NCs for all frequent NNs for L = 15; from now on, we call
an NC large if it is greater than the average NC in its NN. Almost all frequent NNs contain several large NCs; only the ones
ranked 73rd, 74th and 77th are dominated by a single large NC. Overall, for L = 15 there are 56 NNs with only a single large
NC; 19 of these NNs are fully connected. It is clear that all NNs are dominated by their large NCs (Figure S5).

TABLE S2 Overview of results for NCs. For sequence length L, the table lists the number of NCs $n_c$, the mean number of NCs per NN,
the average NC size $\langle V_c \rangle$, the number of large NCs $n_{\mathrm{lrg}}$, the fraction of large NCs, and the fraction of sequence space occupied by large NCs.

FIG. S3 The distribution of NC sizes is even more skewed than for NNs. Here the NCs are ranked by size, again starting at rank 1 for
the largest NC. The yellow line indicates the average size. 1120 NCs (fewer than 10%) are larger than average, but together they cover 95% of
non-trivial genotype space (see Table S2).

FIG. S4 Most NNs contain several large NCs. An NC is called large if its size is at least the average NC size of its NN. The figure shows
the number of large NCs for the frequent structures of L = 15. Only 3 of them contain a single large NC. When the rare structures are also
counted, 56 out of 431 NNs contain only one large NC.

Do larger NNs generally have larger NCs? Figure S6 shows that this is not strictly the case. Of course, the possible NC size
is limited by the size of the entire NN. However, the number of NCs in an NN is not strongly correlated with the size of the
NN (see main text, Figure 1), but with the number of bonds in the corresponding structure. Therefore, NNs of similar size can
have quite different numbers of NCs. Thus the average NC size is not a reliable indicator of the size of the corresponding NN.
Additionally, the overall spread in NC sizes in each NN can be very large (Figure S7).

Due to the strong heterogeneity in the number of genotypes per phenotype, the largest NNs are going to dominate genotype
space. In Figure S8 we show the 12 most abundant structures; Table S3 lists some of their properties. In particular, the number
of (large) components is identical (or close) to $2^n$, where n is the number of base pairs in the structure. For some structures
(e.g. the second most abundant one), there are exactly the $2^n$ NCs we expect due to base pair exchanges (see the main paper for
details).

FIG. S5 NNs are dominated by the large NCs. The x-axis gives the size of an NN and the y-axis marks the combined size of all large NCs
of that NN. The yellow line is a straight-line least-squares fit.

FIG. S6 The NN size does not reliably predict average NC size. As a consequence of the large variation in the absolute number of NCs in
an NN, it is possible that a smaller NN has larger NCs on average. The yellow line is a straight-line least-squares fit to the data points.

FIG. S7 The scatter of NC sizes can be very large. Each point in the figure corresponds to an NN and marks the size of the largest and
smallest NCs in that NN. The black dashed line indicates the equality of largest and smallest NC; only fully connected NNs fall onto this line.
This result implies that the robustness of a phenotype can vary widely depending on which neutral component a population occupies.

This simple dependence of the number of components on the structure corresponding to the NN is striking. In fact, it can be 
linked to the percolation theory arguments given by Reidys (30). He considers random GP maps and derives a threshold for the 
average number of neutral neighbour genotypes in order for an NN to percolate. It is clear that the assumption of a random GP 
map cannot capture the neutral reciprocal sign epistasis that leads to NN fragmentation in RNA. To accommodate this effect,
Reidys introduced another mutational move, namely base pair swaps: In addition to point mutations of individual bases, paired
bases are allowed to change in synchrony and thereby maintain a bond.

When we stick with the more restrictive definition of mutations as single nucleotide substitutions, it is still possible to appeal to 
the percolation theory arguments. However, we need to restrict the set of genotypes to consider. Specifically, given a secondary 
structure we need to fix for each bond whether the base pair is made in the order purine-pyrimidine, or vice-versa. Thus we take 
only a subset of all possible genotypes into account. Within this subset, we can expect the percolation argument to be applicable. 
So if a phenotype is sufficiently frequent (this notion is made precise in (30)), the genotypes mapping into that phenotype will
percolate in each separate subspace.
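
The restriction to a fixed purine-pyrimidine order per bond can be sketched as follows. This is an illustration under stated assumptions: the helper names `base_pairs` and `orientation_signature` and the example structure are hypothetical, and no folding is performed, so the code only counts sequence-level orientation classes rather than verifying that sequences actually fold into the structure.

```python
from itertools import product

PURINES = set("AG")
BONDING_PAIRS = ["GC", "CG", "AU", "UA", "GU", "UG"]


def base_pairs(structure):
    """Return the (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            pairs.append((stack.pop(), j))
    return sorted(pairs)


def orientation_signature(seq, structure):
    """For each bond, record 'RY' (purine-pyrimidine) or 'YR' (the reverse).

    Assumes that every paired position of `seq` holds a bonding pair.
    """
    return tuple(
        "RY" if seq[i] in PURINES else "YR"
        for i, _ in base_pairs(structure)
    )


structure = "((...))"  # hypothetical structure with n = 2 bonds
pairs = base_pairs(structure)
signatures = set()
for choice in product(BONDING_PAIRS, repeat=len(pairs)):
    seq = ["A"] * len(structure)  # unpaired bases do not affect the signature
    for (i, j), (left, right) in zip(pairs, choice):
        seq[i], seq[j] = left, right
    signatures.add(orientation_signature("".join(seq), structure))

print(len(pairs), len(signatures))  # n bonds give 2^n orientation classes: 2 4
```

Each signature labels one of the subspaces within which the percolation argument is applied.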

There are also biophysical reasons that may change our expectation of $2^n$ NCs. In particular, our results rest on the assumption
that a GC base pair can be transformed into an AU pair via a GU intermediate. Empirically, at our temperature of interest (37°C) 
the Vienna package (22) indicates that neutral intermediates exist, at least in the frequent structures. Nonetheless, it may well 
be that these intermediates are not neutral in reality. For example, Meer et al. (7) point out that GU intermediates of tRNAs 
may have a strong selective disadvantage. Our simple model in which neutrality is based only on identical secondary structure 
cannot capture these effects. However, in principle it would be possible to take other factors such as free energy and stability of 
the native fold into account. 

FIG. S8 The most abundant secondary structures. For each rank, the structure is given in the dot-bracket notation and in a simple
diagrammatic representation (22). The ribose backbone is drawn in grey and base pairs are indicated by black lines. The structures run
counter-clockwise from 5' to 3'.

TABLE S3 Overview of the most frequent structures for L = 15. Structures are ranked by their size V (the number of genotypes mapping
into them) starting at rank R = 1 for the most frequent structure. n is the number of bonds in the structure (cf. Figure S8), $N_c$ is the total
number of NCs in the corresponding NN, and $N_{\mathrm{lrg}}$ is the number of large NCs. r is the size ratio of the $2^n$th NC to the largest NC.

A. Crossover does not provide compensatory mutations

FIG. S9 Illustration of crossover. (a) Mismatches in the same stack break the structure. (b) Mismatches in different stacks can lead to the
discovery of a new component. In both cases, the red cross indicates the crossover point.

In our construction of the genotype space hypercube, we have only taken point mutations into account when we connected 
neighbouring genotypes. In nature, other mutational moves are also observed: Deletions or insertions correspond to omitting 
or adding bases during reproduction of the genotype. These mutations change the sequence length which makes them hard to 
discuss in our framework. 

There is one other kind of mutation that we can address, namely crossover. This mutational move is particularly relevant to 
sexually reproducing organisms in which the offspring receives part of its genotype from each parent*. Can crossover connect
separate NCs? 

The simplest case to study is when both parental genotypes belong to the same NC. Let us focus on a single base pair for 
simplicity - if any single pair is not maintained in the offspring phenotype, that phenotype is necessarily different from the 
parental one. For definiteness, we assume that the base pair of interest is made by a purine-pyrimidine pair in both parents, 
say GC in one and AU in the other†. Evidently, crossover between the parents will again result in a purine-pyrimidine pair.
Depending on the parental base pairs, this could be GC, GU, AU or AC. The first three pairs are compatible and may thus lead 
to the parental phenotype. However, they will again be part of the same NC - a transition into a pyrimidine-purine pair (which 
would necessarily be part of a different NC) is not possible. Finally, an AC pair is incompatible and will lead to a different 
phenotype. 

The case of parental genotypes from different NCs is slightly more involved. Let us again consider a purine-pyrimidine pair 
in the first parent, but now a pyrimidine-purine pair in the other parent. We now need to distinguish two cases depending on 
the point of crossover. First, consider the case that the crossover point is within the stem-loop region enclosed by the base pair 
of interest (illustrated in Figure S9). This means that the resulting genotypes will have either a purine-purine or a pyrimidine-
pyrimidine pair, neither of which can form a bond. Thus the offspring phenotype is necessarily different from the parent. Second, 
the point of crossover may be outside the stem-loop region of interest. In that case, the base pair is left intact. If the parental 
phenotype contains two separate stem-loop regions, this scenario of crossover may lead to a new NC: Individually incompatible 
stems are collectively 'shuffled'. Yet this way of exploring different NCs is limited to reusing the already existing stems; new 
variants of individual stems cannot be achieved in this manner.

In summary, crossover alone cannot lead to new NCs. In order to create genotypes on an NC that is different from the parental 
NCs, it is necessary to cross genotypes from different NCs in special positions. However, to arrive at a new NC in the first place
still requires two mutations even if crossover is taken into account. 
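
The case of parents from different NCs with a crossover point inside the stem-loop region can be checked with a small sketch. The parent sequences, positions and helper names are hypothetical and serve only to illustrate the purine/pyrimidine bookkeeping; no folding is involved.

```python
PURINES = set("AG")


def base_class(base):
    """Return 'R' for a purine and 'Y' for a pyrimidine."""
    return "R" if base in PURINES else "Y"


def pair_class(seq, i, j):
    """Classify the bases at positions i and j, e.g. 'RY' or 'RR'."""
    return base_class(seq[i]) + base_class(seq[j])


def crossover(parent_a, parent_b, point):
    """Single-point crossover: head of parent_a joined to tail of parent_b."""
    return parent_a[:point] + parent_b[point:]


# Hypothetical parents: the bond between positions 0 and 6 is G-C in one
# parent (purine-pyrimidine) and C-G in the other (pyrimidine-purine).
parent_a = "GAAAAAC"
parent_b = "CAAAAAG"
i, j = 0, 6

for point in range(len(parent_a) + 1):
    child = crossover(parent_a, parent_b, point)
    print(point, child, pair_class(child, i, j))
```

Any crossover point strictly between the two paired positions yields a purine-purine pair here (or pyrimidine-pyrimidine for the reciprocal offspring), so the bond is lost; only the end points 0 and 7 reproduce a parent.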



* Asexual organisms such as bacteria can also mutate under crossover by horizontal gene transfer.
† Having a GC and a GU pair in the same NC is likely to occur only in frequent structures. For rare phenotypes, only GC may exist; this is not important to the
argument.

S3. NEUTRAL COMPONENTS SHAPE EVOLUTIONARY TRAJECTORIES

In Fig. 3 of the main paper we show that both the robustness to genetic change and the number of phenotypes in the one-mutant
neighbourhood of an NC are correlated with its size. Here, we provide additional views of this data set, focusing on the correlation
of robustness and NC size (Figure S10) and on the correlation of evolvability and NC size (Figure S11).

FIG. S10 NC robustness increases with size. The robustness of an NC is the average normalized connectivity of the genotypes in the NC;
the standard deviation is always much smaller than the average, so that the average is a meaningful quantity (data not shown). The blue points
are the actual data, the yellow line is a straight-line least-squares fit.

FIG. S11 The number of accessible phenotypes increases with size. We measure evolvability of an NC by counting the number of distinct
phenotypes that can be reached by a point mutation off some genotype in the NC. The blue points are the data, the yellow line is a linear fit to
the log-log data. The black dashed line indicates the total number of structures (431 for L = 15).



A. Common and joint evolvabilities for large NNs

Given the large skew in the size of the NNs, it is worth considering another question regarding joint and common evolvability: 
Can the large NNs be reached from each other? By only considering the large NCs of the large NNs, we account for 86 NNs 
and 78% of (non-trivial) genotype space. Alternatively, we can consider all NCs which are larger than the average NC size 
(calculated from all NCs, not just a particular structure). There are 143 NNs with at least one such NC, and all 1120 large NCs 
together cover 95% of all non-trivially folding genotypes. In both cases, we observe in Fig. S12 that the discrepancy between
joint and common evolvability remains significant. 

FIG. S12 Even large NCs are not homogeneously connected. Joint and common evolvability are calculated taking only large NCs into
account. Blue markers correspond to including only the large NCs of the large NNs; yellow markers are for the case of all large NCs (see text
for further explanation). Square markers indicate NNs with only a single large NC; the vertical dashed lines show the number of NNs for each
calculation, which correspond to the maximum evolvability. The other dashed line shows the equality of the joint and common evolvabilities.



B. Relative evolvability for individual NCs 

The common evolvability (as defined in the main text, Eqn. (3)) gives the number of phenotypes that can be reached from 
any NC in a given NN. This is often much less than the joint evolvability (main text, Eqn. (2)) indicating that there are many 
phenotypes that can only be reached from some, but not all NCs with a given phenotype. But this does not tell us how many 
phenotypes can be reached on average from an NC in a given network. To measure this quantity, we define the mean relative
evolvability as the ratio of NC evolvability to NN (joint) evolvability, averaged over all NCs in the network. Alternatively,
we can restrict the average to the large NCs only. In Figure S13 we show that this fractional evolvability increases slightly
with phenotype abundance, but remains clearly below unity. This means that to sample all phenotypes that are part of the joint 
evolvability, it is necessary for a population to jump between NCs. 
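
The three quantities discussed here can be sketched in a few lines. The sketch assumes, following the verbal definitions in this section, that joint evolvability is the size of the union and common evolvability the size of the intersection of the sets of phenotypes accessible from each NC; the function names and the data are hypothetical.

```python
def joint_evolvability(accessible):
    """Number of phenotypes reachable from at least one NC of the NN."""
    return len(set().union(*accessible.values()))


def common_evolvability(accessible):
    """Number of phenotypes reachable from every NC of the NN."""
    return len(set.intersection(*accessible.values()))


def mean_relative_evolvability(accessible):
    """NC evolvability divided by joint evolvability, averaged over NCs."""
    joint = joint_evolvability(accessible)
    return sum(len(p) / joint for p in accessible.values()) / len(accessible)


# Hypothetical data: phenotypes found by non-neutral point mutations off
# each of three NCs that belong to the same NN.
accessible = {
    "NC1": {"p1", "p2", "p3", "p4"},
    "NC2": {"p1", "p2", "p5"},
    "NC3": {"p1", "p6"},
}

print(joint_evolvability(accessible))          # 6
print(common_evolvability(accessible))         # 1
print(mean_relative_evolvability(accessible))  # 0.5
```

In this toy example a population confined to a single NC sees on average half of the phenotypes that the NN as a whole can reach.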

If we take the average over all NNs (cf. Figure S13a), we find that the mean relative evolvability is around 42%; for the
frequent NNs, this average is 59%. If we take only large NCs into account (Fig. S13b), the average is 63% for all NNs and 76%
for the frequent NNs.

FIG. S13 Relative evolvability of individual NCs. The relative evolvability of an NC is defined as the evolvability of that NC divided by the
joint evolvability of the NN to which the NC belongs. (a) Here we calculate the relative evolvability for each NN by averaging over all NCs.
(b) Here the average is computed for the large NCs of each NN only, increasing the mean relative evolvability. In both panels, the vertical
black dashed line marks the average NN size.



C. Choosing large NCs according to entropy stresses the importance of fragmentation 



Our definition of what constitutes a large NC (namely, that it must be larger than the average NC of its NN) is analogous to what we
call a frequent phenotype (27). An alternative definition can be made in terms of the entropy of the distribution of NC sizes. If
we denote the relative size of NC $i$ by $f_i$, we have $\sum_i f_i = 1$, where the sum is over the NCs of a given NN. The entropy of the
distribution is then

$$S = -\sum_i f_i \log f_i .$$

If the NN in question were fragmented into $N$ NCs of equal size, we would obtain $f_i = 1/N$ and $S = \log N$. So $\exp(S)$ gives
us an approximate number of NCs whose relative size is significant in the NN. So in order to determine the large NCs in a given
NN, we calculate $\exp(S)$, round to the nearest integer $N$ and choose the $N$ largest NCs of the NN.
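
The two criteria can be compared on a toy example. The function names and the NC sizes below are hypothetical; the average criterion is implemented with a strict inequality, as in the definition at the start of this section.

```python
import math


def large_by_average(sizes):
    """NCs strictly larger than the average NC size of their NN."""
    mean = sum(sizes) / len(sizes)
    return [s for s in sizes if s > mean]


def large_by_entropy(sizes):
    """The round(exp(S)) largest NCs, with S the entropy of the relative sizes."""
    total = sum(sizes)
    fractions = [s / total for s in sizes]
    entropy = -sum(f * math.log(f) for f in fractions)
    n_large = round(math.exp(entropy))
    return sorted(sizes, reverse=True)[:n_large]


sizes = [50, 30, 15, 5]         # hypothetical NC sizes within one NN
print(large_by_average(sizes))  # [50, 30]
print(large_by_entropy(sizes))  # [50, 30, 15]
```

For four NCs of equal size the entropy criterion returns all four, whereas the strict average criterion returns none.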

In general, the entropy requirement is less restrictive than choosing the average size as a threshold: For only 6 NNs (with ranks 
between 80 and 159) the number of large NCs is reduced under the entropy criterion (by 1 NC each). For 58 NNs, the number of
NCs is the same for both criteria. Thus there remain 367 NNs which have more large NCs by entropy than by average size, and 
the increase can be up to 5-fold (on average, it is 1.6-fold). In absolute terms, the entropy criterion produces 4 additional large 
NCs on average. 

Regarding the discrepancy between joint and common evolvability, it is clear that a larger number of NCs cannot have a smaller 
joint evolvability or a bigger common evolvability than a smaller set. So if we consider the joint and common evolvability of 
the large NCs in an NN, the entropy measure will - for almost all NCs - increase the gap between the two values. This is 
illustrated in Fig. S14. The ratio of common to joint evolvability, when calculated for the NCs that are large according to
the entropy criterion, is Fs ~ 0.25 when averaged over all NNs. Fs correlates with NN size: r = 0.28, p < 10"^ so the 
discrepancy between joint and common evolvability is less pronounced for large NNs. All these results are quantitatively similar 
and qualitatively consistent with the average size criterion for large NCs that we have adopted in the main paper.

Given the two different criteria for what constitutes a large NC, is one more appropriate than the other? One advantage of the 
entropy requirement is that it does not introduce a somewhat arbitrary, hard cutoff. Choosing the average NC size as a threshold 
means that it is practically impossible to find that all NCs are large - this would arise only if all NCs had exactly the same size. It
thus appears that the entropy requirement should be favoured. Clearly, this measure is more lenient in that on average it classifies 
more NCs as being large. As our main interest in this paper is to discern whether the fragmentation of NNs is important for 
evolutionary dynamics, we have chosen the more restrictive, average-based approach in the main paper. This criterion on average
produces fewer large NCs, so the effect of fragmentation is less pronounced. Thus the more natural, entropy-based approach
suggests that NN fragmentation could be even more severe than outlined in our paper. 

FIG. S14 The distinction between joint and common evolvability is robust to the choice of large NCs. In analogy to Fig. 4 in the main
paper, the joint and common evolvability are calculated for the large NCs of each NN. However, in this figure the entropy criterion was used
to determine the large NCs.






S4. SAMPLING AT L = 20 CONFIRMS OUR RESULTS

As genotype space grows exponentially with sequence length, exhaustive enumeration becomes infeasible for longer (and thus 
biologically more interesting) sequences. In particular, to study the evolvability of NCs we need to explore them completely. 
For RNA NNs, we can exploit the fact that their fragmentation has a simple cause, namely base pair complementarity. Using 
this insight, it is relatively straightforward to obtain sequences with the same structure but from different NCs: The Vienna 
package (22) includes a function to inverse-fold structures with constraints. Thus we can fix the base pairs and then try to obtain
sequences with the desired structure. Using these sequences the NCs of a given NN can be mapped out. The evolvability can be 
calculated exactly for each NC by simply keeping track of the phenotypes found by non-neutral mutations. 

There is no guarantee that this approach will find all NCs; clearly, we are more likely to discover sequences on the large 
NCs. For computational convenience, we have run the inversion algorithm on 100 sequences for each of the $6^n$ configurations
of paired bases (again, n is the number of base pairs), choosing the unpaired bases uniformly at random for each attempt. We
should thus be able to find NCs of sizes ranging over at least 2 orders of magnitude; in fact, this range has turned out much
larger, reaching up to 6 orders of magnitude. Details of the structures we sampled are given in Tab. S4.
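
The enumeration of starting sequences can be sketched as follows. The sketch only generates the candidate sequences (fixed paired bases, random unpaired bases); the constrained inverse folding itself is not performed, and the helper names, the example structure and the number of attempts are hypothetical.

```python
import random
from itertools import product

BONDING_PAIRS = ["GC", "CG", "AU", "UA", "GU", "UG"]


def base_pairs(structure):
    """Return the (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            pairs.append((stack.pop(), j))
    return sorted(pairs)


def seed_sequences(structure, attempts, rng):
    """Yield `attempts` sequences for each of the 6^n paired-base configurations."""
    pairs = base_pairs(structure)
    for choice in product(BONDING_PAIRS, repeat=len(pairs)):
        for _ in range(attempts):
            # Unpaired bases are drawn uniformly at random for each attempt.
            seq = [rng.choice("ACGU") for _ in structure]
            for (i, j), (left, right) in zip(pairs, choice):
                seq[i], seq[j] = left, right
            yield "".join(seq)


rng = random.Random(0)
seeds = list(seed_sequences("((....))", attempts=3, rng=rng))
print(len(seeds))  # 3 attempts for each of 6**2 configurations = 108
```

Each seed would then be handed to an inverse-folding routine with its paired positions held fixed, and the sequences that fold into the target structure used to map out the NCs.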

In order to demonstrate that we found most genotypes of an NN, we use the sampling algorithm by Jörg et al. (34) to estimate
NN sizes. By comparing the estimated NN size to the number of genotypes found by our sampling approach, we obtain an
indication of whether the sampling approach has found all major NCs.

It is important to be clear what it means that smaller NCs may not be found. From the definitions (Eqns. (1) and (2) in
the main text) it is evident that any subset of NCs gives us a lower bound on the joint evolvability and an upper bound on the
common evolvability. Therefore, if the sampling is incomplete, more complete results can only show that the discrepancy between
joint and common evolvability is greater than our results indicate.
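
These bounds follow from elementary set inclusion. As a sketch, write $P_c$ for the set of phenotypes accessible from NC $c$ and $\mathcal{C}$ for the set of all NCs of an NN (both symbols are introduced here for illustration), and take the joint and common evolvability to be the sizes of the union and the intersection of these sets. Then any sampled subset $\mathcal{C}' \subseteq \mathcal{C}$ satisfies

```latex
\[
\Bigl|\bigcup_{c \in \mathcal{C}'} P_c\Bigr| \;\le\; \Bigl|\bigcup_{c \in \mathcal{C}} P_c\Bigr|,
\qquad
\Bigl|\bigcap_{c \in \mathcal{C}'} P_c\Bigr| \;\ge\; \Bigl|\bigcap_{c \in \mathcal{C}} P_c\Bigr| .
\]
```

Adding further NCs can therefore only enlarge the union and shrink the intersection, so the measured gap between the two is a conservative estimate.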

In general, we find that the discrepancy between joint and common evolvability is large (Fig. S15): on average, the ratio of
common to joint evolvability is $F = 0.16$ when all NCs are included and $F_{\mathrm{large}} = 0.21$ for large NCs only. Therefore, the
contingency due to neutral space fragmentation will be important at biologically realistic sequence lengths.

TABLE S4 Details of the sampled structures. For each structure, its representation in the dot-bracket notation is given, together with the
estimated and sampled sizes and the number of components. Large components are those that are greater than the average of the NCs that have
been found - note that when small NCs are not found, this produces an over-estimate of the average. n is the number of bonds in the structure,
and r is the ratio of the size of the $2^n$th NC to that of the largest NC.

FIG. S15 NC heterogeneity increases with sequence length. Shown are the joint and common evolvability (as defined in the main text) of
the structures sampled, according to Tab. S4. In (a) all NCs have been included while for (b) only NCs of more than average size have been
used.

S5. NCS BEYOND RNA

Establishing the fragmentation of an NN is challenging. If we cannot exploit simple biophysical principles (as in the case of
RNA), or if we would like to study properties of NCs, we need to determine phenotypes for large portions of genotype space.
Due to the vast numbers of genotypes even in small systems, this is a demanding task. Experimental progress in this area is
underway; a pioneering study by Weinreich et al. (15) has determined the antibiotic resistance due to a particular β-lactamase.
The authors considered 5 mutations that jointly increase the resistance by about 100,000-fold and measured the resistance of
all intermediate mutants between the wildtype and the most resistant type. In Figure S16 we show the resistance landscape
that arises in this system. Within the resolution of the experiment, there are several NNs of genotypes that convey the same
resistance. Some of these NNs contain multiple NCs; however, it cannot be ruled out that mutations outside the scope of the
experiment connect these NCs.




FIG. S16 Experimental evidence for NCs. The figure shows the genotype landscape investigated by Weinreich et al. Neutral mutations
are indicated by solid lines, non-neutral mutations are dashed. The horizontal position of each marker indicates its mutational distance from
the wild-type, and its shape and colour indicate its fitness.

Another system in which the mapping from genotype to phenotype is well characterized is the genetic code, linking triplets of
nucleotides in DNA (or mRNA) to amino acids in proteins*. Here, a neutral network consists of all the codons that are translated
into the same amino acid (we treat the STOP signal as another amino acid). In the universal genetic code there are 21 NNs with sizes
between 6 (arginine, leucine, serine) and 1 (methionine, tryptophan). Serine is the only amino acid whose NN is fragmented.
This NN contains two NCs, of size 2 and 4, respectively. So in total, there are 22 NCs in 21 NNs.
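
These counts can be reproduced with a short sketch. The table below is the standard genetic code written with U instead of T, with codons ordered UUU, UUC, UUA, UUG, UCU, ... and `*` denoting STOP; the function name `count_components` is an illustrative choice.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the codon order generated below ('*' = STOP).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
CODE = dict(zip(CODONS, AMINO_ACIDS))


def count_components(code):
    """Count neutral components per amino acid under single point mutations."""
    seen, counts = set(), {}
    for start in code:
        if start in seen:
            continue
        counts[code[start]] = counts.get(code[start], 0) + 1
        seen.add(start)
        stack = [start]
        while stack:  # depth-first search over synonymous neighbours
            codon = stack.pop()
            for i in range(3):
                for base in BASES:
                    nb = codon[:i] + base + codon[i + 1:]
                    if nb not in seen and code[nb] == code[start]:
                        seen.add(nb)
                        stack.append(nb)
    return counts


counts = count_components(CODE)
print(len(counts), sum(counts.values()))              # 21 22
print({aa: n for aa, n in counts.items() if n > 1})   # {'S': 2}
```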

It is well known that the universal genetic code is significantly different from an arbitrary assignment of codons to amino
acids (26). Here, we are interested to discern if the NN connectivity observed in the code is another property that sets the 
universal code apart from a random alternative. To this end, we generated 4x10^ codes by assigning each codon a randomly 
chosen amino acid, such that the degeneracy of the universal code is maintained. For each realization, we counted the overall 
number of NCs. A histogram of the data is shown in Figure S17. On average, a random code contains 51 ± 3 NCs, many more
than the 22 NCs of the universal code. The lowest number of NCs in our sample was 33 and this was realized only once. The 
maximum possible number of NCs, 64, was found in 23 random realizations of the code. 
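
A minimal sketch of this randomization is given below. It assumes that "maintaining the degeneracy" means shuffling the 64 amino acid labels of the standard code over the codons; the sample of 1000 codes, the random seed and the function names are illustrative and far smaller than the sample described above.

```python
import random
from itertools import product

BASES = "UCAG"
# Standard genetic code in the codon order generated below ('*' = STOP).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]


def total_components(code):
    """Total number of NCs of a codon -> amino acid map under point mutations."""
    seen, total = set(), 0
    for start in CODONS:
        if start in seen:
            continue
        total += 1
        seen.add(start)
        stack = [start]
        while stack:
            codon = stack.pop()
            for i in range(3):
                for base in BASES:
                    nb = codon[:i] + base + codon[i + 1:]
                    if nb not in seen and code[nb] == code[start]:
                        seen.add(nb)
                        stack.append(nb)
    return total


def random_code(rng):
    """Shuffle amino acids over codons, preserving each amino acid's degeneracy."""
    labels = list(AMINO_ACIDS)
    rng.shuffle(labels)
    return dict(zip(CODONS, labels))


rng = random.Random(1)
samples = [total_components(random_code(rng)) for _ in range(1000)]
mean = sum(samples) / len(samples)
print(total_components(dict(zip(CODONS, AMINO_ACIDS))))  # universal code: 22
print(min(samples), round(mean, 1), max(samples))
```

The sample mean can be compared with the value reported above; the exact minimum and maximum depend on the seed and the sample size.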

It is clear that the degree of NN connectivity of the universal code is far from random. However, the universal code does not 
minimize the total number of NCs completely - for example, exchanging the 2 tyrosine codons with the smaller NC of serine 
would yield a maximally connected code. One possible explanation of the high connectivity of the universal code is that it 
confers robustness of the protein amino acid sequence to point mutations in the DNA and to translation errors (26). This finding 
illustrates an important message of the main paper: The robustness of a phenotype (the amino acid) cannot be ascribed to the 
properties of the phenotype itself (such as the degeneracy of the amino acid), but is sensitive to the local connectivity of the NC.



* In many other contexts, the sequence of residues in a protein is considered as its genotype. Fundamentally however, mutations change DNA, while biological
function depends on the amino acid sequence. 
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FIG. S17 Random genetic codes show a high degree of NN fragmentation. 4 x lO'^ genetic codes were generated by assigning each codon 
a random amino acid, keeping the degeneracy of the universal code, and the number of NCs in each code was evaluated. The data have mean
μ = 51.6 and standard deviation σ = 3.0. The red line shows a normal distribution with these parameters. The position of the universal
genetic code is indicated in blue. The smallest number of components found in the sampled data is 33.



